LZ78 Compression in Low Main Memory Space

Authors

  • Diego Arroyuelo
  • Rodrigo Cánovas
  • Gonzalo Navarro
  • Rajeev Raman
Abstract

We present the first algorithm that performs the LZ78 compression of a text of length n over alphabet [1..σ], whose output is z integers, using only O(z lg σ) bits of main memory. The algorithm reads the input text from disk in a single pass and writes the compressed output to disk. The text can also be decompressed within the same main memory usage, which is likewise unprecedented. The algorithm is based on hashing and, under some simplifying assumptions, runs in O(n) expected time. Experiments confirm this running time and show the superiority of the algorithm over previously implemented LZ78 compressors.
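
For readers unfamiliar with LZ78, the following is a minimal textbook sketch of the parse, with the phrase dictionary kept in a hash table. It is not the algorithm of the paper: it makes no attempt to reach the O(z lg σ)-bit bound, it works on an in-memory string rather than streaming from disk, and the function names `lz78_compress` and `lz78_decompress` are hypothetical. Each output pair is (id of the longest previously seen phrase, next symbol), so the output has z pairs for z phrases.

```python
def lz78_compress(text):
    """Return the LZ78 parse of `text` as a list of (phrase_id, symbol) pairs."""
    # Hash table mapping (parent phrase id, symbol) -> new phrase id.
    # Phrase id 0 denotes the empty phrase.
    trie = {}
    output = []
    node = 0  # id of the phrase matched so far
    for ch in text:
        child = trie.get((node, ch))
        if child is not None:
            # The current phrase extended by ch is already known: keep matching.
            node = child
        else:
            # New phrase: register it, emit its pair, restart from the empty phrase.
            trie[(node, ch)] = len(trie) + 1
            output.append((node, ch))
            node = 0
    if node != 0:
        # The text ended inside a known phrase: emit it with no trailing symbol.
        output.append((node, ""))
    return output


def lz78_decompress(pairs):
    """Rebuild the text from (phrase_id, symbol) pairs.

    Assumption for simplicity: phrases are stored as full strings, which is
    far from space-efficient and only serves to show the decoding rule.
    """
    phrases = [""]  # phrases[0] is the empty phrase
    out = []
    for parent, ch in pairs:
        phrase = phrases[parent] + ch
        phrases.append(phrase)
        out.append(phrase)
    return "".join(out)
```

As a usage example, `lz78_compress("abababab")` splits the text into the phrases a, b, ab, aba, b, and passing the result to `lz78_decompress` restores the original string.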

Similar resources

Smaller Representation of Finite State Automata

This paper is a follow-up to Jan Daciuk’s experiments on space-efficient finite state automata representation that can be used directly for traversals in main memory [4]. We investigate several techniques of reducing the memory footprint of minimal automata, mainly exploiting the fact that transition labels and transition pointer offset values are not evenly distributed and so are suitable for ...

Space-Conscious Compression

Compression is most important when space is in short supply, so compression algorithms are often implemented in limited memory. Most analyses ignore memory constraints as an implementation detail, however, creating a gap between theory and practice. In this paper we consider the effect of memory limitations on compression algorithms. In the first part we assume the memory available is fixed and...

Space-efficient construction of Lempel-Ziv compressed text indexes

A compressed full-text self-index is a data structure that replaces a text and in addition gives indexed access to it, while taking space proportional to the compressed text size. This is very important nowadays, since one can accommodate the index of very large texts entirely in main memory, avoiding the slower access to secondary storage. In particular, the LZ-index [G. Navarro, Journal of Di...

Speech Data Compression for Embedded Systems

The main concern of this paper is speech data compression for low-cost embedded systems such as voice-related toys or devices with interactive sound-responses. We use a PC to generate and compress 8-bit-speech-data that has various features such as human speech, symphony and animal songs; the compressed data are then transferred to a masked-ROM. An Intel 8051 embedded chip is employed to expand...

IBM Memory Expansion Technology (MXT)

Several state-of-the-art technologies are leveraged to establish an architecture for a low-cost and high performance memory controller and memory system that more than doubles the effective size of the installed main memory without significant added cost. This unique architecture is the first of its kind to employ real-time main memory content compression at a performance competitive with the b...

Publication date: 2017